Abstract

Records of social interactions provide us with new sources of data for understanding 
how interaction patterns affect collective dynamics. Such human activity patterns are 
often bursty, i.e., they consist of short periods of intense activity followed by long periods 
of silence. This burstiness has been shown to affect spreading phenomena; it accelerates 
epidemic spreading in some cases and slows it down in other cases. We investigate a model 
of history-dependent contagion. In our model, repeated interactions between susceptible
and infected individuals in a short period of time are needed for a susceptible individual to
contract infection. We carry out numerical simulations on real temporal network data to 
find that bursty activity patterns facilitate epidemic spreading in our model. 
1 Introduction



Communication between individuals is a foundation of human society. Nowadays, technologies
such as sensor devices and online communication services provide us with massive records of
interaction between individuals, including face-to-face conversations, e-mail exchanges, and
phone calls. Such data often consist of a sequence of interaction events. Each event is
represented by a triplet, i.e., the IDs of the two individuals involved in the event and the time of
the event. One traditional way to characterize such data is to represent them as an aggregated
network, in which a link is drawn between two nodes (i.e., individuals) that communicate
in at least one event, and to investigate structural properties of the aggregated static network [1].
Another and richer representation of this type of data is to model them as temporal networks,
in which the links between two nodes exist only at the time of an event [2].

Effects of temporal networks on contagious phenomena, such as infectious diseases and
rumors, have been investigated by various authors. To simulate spreading dynamics on temporal
networks, we read the events in an empirical event sequence one by one in chronological
order and possibly update the states (e.g., susceptible and infected) of the two nodes involved
in the event. Karsai and colleagues simulated the susceptible-infected (SI) model on temporal
networks and found that bursty activity patterns slow down contagions [3]. Bursty activity
patterns are identified with a long-tailed distribution of the interevent intervals (IEIs) [4,5].
The slowing down occurs because, at an arbitrary time point, the average time to the next event
is longer for the long-tailed IEI distribution than for the exponential IEI distribution with the
same mean. In other words, after an individual gets infected, it tends to take a longer time to
infect the neighbors under the long-tailed than under the exponential IEI distribution. Other
numerical [6, 7] and analytical [8-10] results also support that the long-tailed IEI distribution
mitigates contagion. However, burstiness was reported to accelerate contagion on a different
data set [11] and for a different type of epidemic dynamics [12]. The effect of burstiness on
contagious processes is therefore not yet well understood.
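The comparison of waiting times above follows from the waiting-time (inspection) paradox: a uniformly random time point falls in an interval of length x with probability proportional to x and then waits x/2 on average, so the mean time to the next event is the sum of squared IEIs divided by twice their sum. The following minimal sketch illustrates this with two made-up IEI sequences of equal mean; the function name and the numbers are illustrative assumptions, not data from the study.

```python
def mean_time_to_next_event(ieis):
    """Mean waiting time from a uniformly random time point to the next event.

    A random time point lands in an interval of length x with probability
    x / sum(ieis) and then waits x / 2 on average.
    """
    total = sum(ieis)
    return sum(x * x for x in ieis) / (2.0 * total)


# Two hypothetical IEI sequences with the same mean (1.0):
regular = [1.0, 1.0, 1.0, 1.0]   # evenly spaced events
bursty = [0.1, 0.1, 0.1, 3.7]    # a short burst followed by a long silence

if __name__ == "__main__":
    print(mean_time_to_next_event(regular))
    print(mean_time_to_next_event(bursty))
```

The bursty sequence yields the longer mean waiting time even though both sequences have the same mean IEI, which is the mechanism behind the slowing down in the SI model.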

In the present study, we show that bursty activity patterns facilitate epidemic spreading in a
variant of the deterministic threshold model [13, 14]. In standard models of epidemics including
the SI, susceptible-infected-recovered (SIR), and susceptible-exposed-infected-recovered (SEIR)
models, which have been employed in the literature cited above, a susceptible node gets infected
from an infected neighbor with a constant probability in an event, regardless of the amount of
exposure to infected neighbors in the past. However, history-dependent thresholding effects, in
which the thresholding operates on the concentration of the pathogen, have been reported for
some infectious diseases mediated by bacteria, such as tuberculosis and dysentery [15].
In the case of information propagation, exposure to the information increases one's interest
in a topic, and the attractiveness of a topic decays in time in the absence of stimulus [16, 17].
We may need multiple interactions to persuade others to do something, and repeated contacts
in a short period can be more effective than those dispersed over a long period. To consider
this type of infection, we generalize the deterministic threshold model to the case of history
dependence and memory decay and simulate the proposed model on temporal network data.

2 Methods 

Each node i is assumed to have an internal variable denoted by $v_i$ (i = 1, 2, ..., N), which
represents, for example, the concentration of a pathogen in the individual or the individual's
interest in a topic. Initially, $v_i$ is equal to zero for all i. We assume that node i is in the
susceptible (S) state before $v_i$ exceeds a threshold value $v_{\mathrm{thr}}$ and that node i is in the infected
(I) state once $v_i$ exceeds $v_{\mathrm{thr}}$. Each node is in either state. Nodes in state I never return to state
S; our model is an extension of the SI model. Therefore, the number of I nodes monotonically
increases in time.

When node i in state S interacts with an I node through an event, $v_i$ is increased by unity.
In the absence of interaction with I nodes, $v_i$ is assumed to decay exponentially in time. In
other words, $v_i(t)$ is given by

$$ v_i(t) = \sum_{e:\, t_e \le t} \exp\left( -\frac{t - t_e}{\tau_d} \right), \qquad (1) $$

where $t_e$ is the time of an event between node i and an I node, and $\tau_d$ is the decay time constant.
An example time course of $v_i(t)$ is shown in Fig. 1.
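Because each term of Eq. (1) decays with the same time constant, the sum can be accumulated event by event instead of being recomputed from scratch. The sketch below assumes that Eq. (1) is the sum of exp(-(t - t_e)/tau_d) over the events with t_e <= t and checks that the recursive update agrees with the direct sum; all function names are illustrative.

```python
import math


def v_direct(event_times, t, tau_d):
    """Evaluate the assumed form of Eq. (1): decayed unit increments summed."""
    return sum(math.exp(-(t - t_e) / tau_d) for t_e in event_times if t_e <= t)


def v_recursive(event_times, t, tau_d):
    """Same quantity, accumulated event by event (event_times must be sorted)."""
    v, t_prev = 0.0, None
    for t_e in event_times:
        if t_e > t:
            break
        if t_prev is not None:
            # Decay the stored value over the gap since the previous event.
            v *= math.exp(-(t_e - t_prev) / tau_d)
        v += 1.0  # unit increment caused by the event at t_e
        t_prev = t_e
    if t_prev is None:
        return 0.0  # no event has occurred by time t
    # Decay from the last event up to the query time t.
    return v * math.exp(-(t - t_prev) / tau_d)
```

The recursive form needs only the current value and the time of the last update per node, which is convenient when events are read one by one in chronological order.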

The model contains two parameters, $\tau_d$ and $v_{\mathrm{thr}}$, and can be regarded as a variant of the
deterministic threshold model [13,14]. Although we assume that all the nodes have the same
values of $\tau_d$ and $v_{\mathrm{thr}}$ for simplicity, it is straightforward to generalize the model to the case of
heterogeneous parameter values.

We simulate our model numerically on empirical temporal networks in the following way.
At t = 0, we select a node as the initial seed and set its state to I. All the other nodes are initially
in state S. Then, we read the event sequence one by one in chronological order and update $v_i$ and
the states of the two nodes involved in the event. Because our model is deterministic, the final
infection size (i.e., fraction of I nodes at time $t_{\max}$, where $t_{\max}$ is the time of the last event in
the data set), denoted by $I_i$, is unique for given initial seed i, $\tau_d$, and $v_{\mathrm{thr}}$.
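The simulation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: events are taken to be (t, i, j) tuples already sorted by time, "exceeds" is read as a strict inequality, events with equal time stamps are processed in list order, and the function name and argument names are hypothetical.

```python
import math


def simulate_threshold_si(events, seed, tau_d, v_thr, n_nodes):
    """Return the final fraction of infected nodes for one initial seed.

    events: iterable of (t, i, j) tuples sorted by time t (undirected contacts).
    """
    infected = {seed}
    v = {}       # current v_i of each susceptible node contacted so far
    last_t = {}  # time at which v_i was last updated
    for t, i, j in events:
        # Only S-I contacts change anything; S-S and I-I contacts are skipped.
        if (i in infected) == (j in infected):
            continue
        s = j if i in infected else i
        # Decay the stored value since its last update, then add one unit.
        decayed = v.get(s, 0.0) * math.exp(-(t - last_t.get(s, t)) / tau_d)
        v[s] = decayed + 1.0
        last_t[s] = t
        if v[s] > v_thr:  # assumption: strict inequality for "exceeds"
            infected.add(s)
    return len(infected) / n_nodes
```

With a threshold below one, every contact with an infected node causes infection, so the sketch reduces to the deterministic SI model on the event sequence.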

We use two data sets. The first data set, called Conference in the following, is the face- 
to-face conversation log between attendees of a scientific conference [18]. The second data set, 
called Email, is the record of e-mail exchanges between the members of a university [19]. In 
the second data set, we neglect the direction of the interaction (i.e., from sender to receiver) 
for simplicity. The basic statistics of the data sets are summarized in Tab. 1. 

3 Results 

In Figs. 2(a) and 2(b), we plot the dependence of the final infection size $I_m$ on $\tau_d$ and $v_{\mathrm{thr}}$ for
the initial seed node m having the maximum number of events in Conference and Email data sets,
respectively. In the black parameter region, no infection occurs such that $I_m = 1/N$. Naturally,
$I_m$ increases with $\tau_d$ and decreases with $v_{\mathrm{thr}}$.

Next, we carry out the same set of simulations on the randomized temporal networks for the
sake of comparison. To this end, we use the so-called randomly-permuted-times randomization,
in which the time stamps of all the events are randomly shuffled [2,3,6]. The randomization
eliminates temporal properties of the original temporal networks such as bursty activity
patterns, daily and weekly patterns, and the pairwise correlations of the IEIs, whereas it conserves
all the properties of the aggregated networks, i.e., the weighted adjacency matrix.
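A minimal sketch of the randomly-permuted-times randomization is given below. It assumes that events are stored as (t, i, j) tuples; the function name and the optional random-number-generator argument are illustrative choices.

```python
import random


def randomly_permute_times(events, rng=None):
    """Shuffle the time stamps among events while keeping each node pair.

    The multiset of time stamps and the number of events per node pair
    (i.e., the weighted aggregated network) are both preserved.
    """
    if rng is None:
        rng = random.Random()
    times = [t for t, _, _ in events]
    rng.shuffle(times)  # in-place random permutation of the time stamps
    shuffled = [(t, i, j) for t, (_, i, j) in zip(times, events)]
    shuffled.sort(key=lambda e: e[0])  # restore chronological order
    return shuffled
```

Passing a seeded generator such as `random.Random(0)` makes a randomized realization reproducible.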

For the randomized temporal networks, the dependence of $I_m$ on $\tau_d$ and $v_{\mathrm{thr}}$ is shown
in Figs. 2(c) and 2(d) for Conference and Email data sets, respectively. We find that the
parameter region in which infection occurs is larger for the original temporal networks (colored
regions in Figs. 2(a) and 2(b)) than for the randomized temporal networks (colored regions in
Figs. 2(c) and 2(d)) for intermediate values of $\tau_d$ ($10^2 < \tau_d < 10^4$ and $10^4 < \tau_d < 10^6$ for
Conference and Email data sets, respectively). In the original data sets, the nodes tend to have
many events in bursty periods and be quiescent in other periods. The randomization procedure
eliminates bursty activity patterns. Therefore, $v_m(t)$ can reach $v_{\mathrm{thr}}$ in such a bursty period for
the original but not the randomized temporal networks if $\tau_d$ and $v_{\mathrm{thr}}$ take intermediate values. In
the randomized data sets, $v_m(t)$ tends to decay faster than it grows, although the number of
events per node is the same between the original and randomized data.

For Email data set, $I_m$ for the randomized data set (Fig. 2(d)) is larger than that for
the original data set (Fig. 2(b)) when $\tau_d$ is large and $v_{\mathrm{thr}}$ is small. This is mainly because
the randomization increases the reachability ratio from initial seed m to a large extent. The
reachability ratio from a node is defined as the fraction of nodes that we can reach from the
node by tracing the events in chronological order [20]. If every event can elicit infection,
which is the case when $\tau_d$ is large and $v_{\mathrm{thr}}$ is small, $I_m$ is approximated by the reachability ratio
from node m. The reachability ratio from node m = 3024 in Email data set is equal to 0.7458
and 0.9981 for the original and randomized data sets, respectively. In contrast, the reachability
ratio from node m = 55 in Conference data set is equal to 0.9642 and 1 for the original and
randomized data sets, respectively; the difference is smaller than in the case of Email data set.
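The reachability ratio can be computed with a single chronological pass over the events, as in the sketch below. It rests on assumptions not fixed by the text: the source node is counted as reached, contacts are undirected, events are (t, i, j) tuples sorted by time, and events sharing a time stamp are processed in list order. The function name is illustrative.

```python
def reachability_ratio(events, source, n_nodes):
    """Fraction of nodes reachable from `source` via time-respecting paths.

    events: iterable of (t, i, j) tuples sorted by time t.
    """
    reached = {source}  # assumption: the source counts as reached
    for _, i, j in events:
        # A contact extends a time-respecting path if one end is already reached.
        if i in reached or j in reached:
            reached.update((i, j))
    return len(reached) / n_nodes
```

The order of events matters: a contact between nodes 1 and 2 that occurs before node 1 is reached does not help to reach node 2.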

In Fig. 3, the average final infection size $\langle I_i \rangle$, defined as the average of $I_i$ over all the nodes i,
is plotted as a function of $\tau_d$ for two values of $v_{\mathrm{thr}}$ for each data set. Figure 3 indicates that $\langle I_i \rangle$
for the original temporal networks is larger than that for the randomized temporal networks
for a broad range of $\tau_d$ for both data sets.

In the bond percolation on static networks, the probability that single bonds are open
(independently for different bonds) is the sole parameter that determines whether the
entire network has a giant component [1]. Motivated by this picture, we hypothesize that
the results shown in Figs. 2 and 3 are largely explained by the bursty nature of events on
single links. In other words, we speculate that the structure of the aggregated networks and
the correlation between event sequences on different links have little influence on the results. To
test the hypothesis, we separately examine the event sequence on each link. For each link, i.e.,
node pair (i, j) with at least one event, $T_{j \to i}(t)$ is defined as the time required for node i to be
infected after node j gets infected at time t. We emphasize that we do not consider influences from
other nodes on i in this analysis. We take the time average of $T_{j \to i}(t)$, denoted by $\overline{T}_{j \to i}$, over
$0 \le t \le t_{\max}$. A problem with the time averaging is that $T_{j \to i}(t)$ is indefinite for sufficiently
large t because i does not get infected by time $t_{\max}$. Therefore, we adopt the boundary condition
in which the first events between nodes i and j virtually replay after $t = t_{\max}$. We denote the
time of the first event between i and j by $t_1$. If we temporarily set $T_{j \to i}(t_{\max} + t_1) = T_{j \to i}(t_1)$,
it takes at most $t_{\max} - t + t_1 + T_{j \to i}(t_1)$ for node i starting with $v_i(t) = 0$ to be infected from
node j, where $t_{\mathrm{last}} < t \le t_{\max}$ and $t_{\mathrm{last}}$ is the last time before which $T_{j \to i}(t)$ is finite. Therefore,
we set $T_{j \to i}(t) = t_{\max} - t + t_1 + T_{j \to i}(t_1)$ for $t_{\mathrm{last}} < t \le t_{\max}$. This boundary condition is the
same as that used in Ref. [21] for defining the average temporal path length. If $T_{j \to i}(t_1)$
is indefinite (i.e., infection never occurs between i and j), $\overline{T}_{j \to i}$ is set to infinity. We define
the average single-link infection rate, denoted by $\langle 1/\overline{T}_{j \to i} \rangle$, as the average of
$1/\overline{T}_{j \to i}$ over the 20% of links with the largest numbers of
events, because the majority of the links possess a small number of events in both data sets.
This thresholding leaves 441 and 6,932 links for Conference and Email data sets, respectively.
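The core of the single-link analysis, the time for the accumulated variable on one link to exceed the threshold, can be sketched as follows. This is a simplified illustration only: it omits the replay boundary condition and the time averaging described above, it assumes that an event occurring exactly at the starting time counts, and it reads "exceeds" as a strict inequality. The function name and arguments are hypothetical.

```python
import math


def single_link_infection_time(link_times, t_start, tau_d, v_thr):
    """Time from t_start until the variable of node i exceeds v_thr.

    link_times: sorted event times on the link between nodes i and j.
    Node j is infected at t_start and node i starts from a value of zero.
    Returns None if the threshold is not exceeded within the recorded events
    (no replay of the event sequence is attempted in this sketch).
    """
    v, t_prev = 0.0, None
    for t_e in link_times:
        if t_e < t_start:
            continue  # events before j's infection do not contribute
        if t_prev is not None:
            v *= math.exp(-(t_e - t_prev) / tau_d)  # decay since last event
        v += 1.0
        t_prev = t_e
        if v > v_thr:
            return t_e - t_start
    return None
```

A burst of closely spaced events on the link crosses the threshold quickly, whereas the same number of widely spaced events may never cross it when the decay time constant is short.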

$\langle 1/\overline{T}_{j \to i} \rangle$ for the original and randomized temporal networks is shown for various $\tau_d$ and
$v_{\mathrm{thr}}$ values for Conference (Figs. 4(a) and 4(b)) and Email (Figs. 4(c) and 4(d)) data sets.
Because infection can be induced only through a single link in the present simulations, we
examined $v_{\mathrm{thr}}$ values that are much smaller than those used in Figs. 2 and 3. For both data
sets, $\langle 1/\overline{T}_{j \to i} \rangle$ for the original temporal networks (Figs. 4(a) and 4(c)) is larger than that for the
randomized networks (Figs. 4(b) and 4(d)) for intermediate values of $\tau_d$ ($10^2 < \tau_d < 10^4$ and
$10^4 < \tau_d < 10^6$ for Conference and Email data sets, respectively). The behavior of $\langle 1/\overline{T}_{j \to i} \rangle$ is
consistent with the results of the network-based simulations (Figs. 2 and 3).

4 Conclusions 

We numerically simulated a variant of the deterministic threshold model on empirical temporal
networks. We found that the average final infection size for the empirical temporal networks is
larger than that for the randomized temporal networks in a broad parameter region (Figs. 2
and 3). The bursty nature of the IEIs on single links has sufficient explanatory power for the
results of the network-based simulations (Fig. 4). The burstiness promoted epidemic spreading
when the decay time constant $\tau_d$ takes an intermediate value ($10^2 < \tau_d < 10^4$ and $10^4 < \tau_d < 10^6$
(seconds) for Conference and Email data sets, respectively). This range of $\tau_d$ may be practical
because the influence of a pathogen that an individual has received may last for hours to days.

The finding that burstiness facilitates spreading also sheds light on a function of the
redundant interaction events. We previously found that about 80% of the events are redundant
in the sense that they contribute little to bridging efficient temporal paths in Conference data
set [22]. However, for the spreading dynamics in our model, such redundant events play a
crucial role in increasing $v_i(t)$ within bursty periods.

Figure 1: $v_{i=0}(t)$ for $1.05 \times 10^6 \le t \le 1.08 \times 10^6$ in Email data set. We set $\tau_d = 1000$. The
vertical ticks in the box at the bottom indicate the times of the events that involve node
i = 0.

Figure 2: Dependence of the final infection size $I_m$ on $\tau_d$ and $v_{\mathrm{thr}}$. (a), (b) Original temporal
networks. (c), (d) Randomized temporal networks. (a), (c) $I_{m=55}$ in Conference data set.
(b), (d) $I_{m=3024}$ in Email data set. No infection occurs in the black parameter regions. The
parameter values for which at least one infection occurs are colored.

Figure 3: Average final infection size $\langle I_i \rangle$ for (a, b) Conference and (c, d) Email data sets.
Squares and circles correspond to the original and randomized temporal networks, respectively.
We set (a) $v_{\mathrm{thr}} = 5$, (b) $v_{\mathrm{thr}} = 20$, (c) $v_{\mathrm{thr}} = 3$, and (d) $v_{\mathrm{thr}} = 10$.

Figure 4: Average single-link infection rate $\langle 1/\overline{T}_{j \to i} \rangle$ for (a), (b) Conference and (c), (d) Email
data sets. (a), (c) Original temporal networks. (b), (d) Randomized temporal networks.

Table 1: Statistics of the two data sets.

                       Conference    Email
Number of nodes (N)    113           3,188
Number of events       20,808        309,125
Recording period       3 days        83 days
Time resolution        20 sec        1 sec